ggplot2 Exercises - Solutions

Recreate the plots shown below. Don't worry if your plots don't match them exactly, as long as you have a general understanding of ggplot2 and the grammar of graphics.


Important Note!

Some of the images may be distorted by the conversion to a web format.


For the first few plots, use the mpg dataset

In [21]:
library(ggplot2)
library(ggthemes)
head(mpg)
Out[21]:
  manufacturer model displ year cyl trans      drv cty hwy fl class
1 audi         a4    1.8   1999 4   auto(l5)   f   18  29  p  compact
2 audi         a4    1.8   1999 4   manual(m5) f   21  29  p  compact
3 audi         a4    2     2008 4   manual(m6) f   20  31  p  compact
4 audi         a4    2     2008 4   auto(av)   f   21  30  p  compact
5 audi         a4    2.8   1999 6   auto(l5)   f   16  26  p  compact
6 audi         a4    2.8   1999 6   manual(m5) f   18  26  p  compact

Histogram of hwy mpg values:

In [11]:
ggplot(mpg,aes(x=hwy)) + geom_histogram(bins=20,fill='red',alpha=0.5)
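
The bins argument above fixes the number of bins. geom_histogram() also accepts binwidth, which fixes the width of each bin in data units instead. The following is a minimal sketch of that alternative; the value binwidth = 2 and the name p_hist are arbitrary choices for illustration and are not part of the exercise.

```r
library(ggplot2)

# Same histogram of highway mpg, but each bin is 2 mpg wide
# instead of splitting the range into a fixed number of bins.
# binwidth = 2 is an illustrative assumption, not the exercise's setting.
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2, fill = 'red', alpha = 0.5)

print(p_hist)
```

A fixed bin width can be easier to interpret when the units matter, while a fixed bin count is convenient for a quick first look.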

Barplot of car counts per manufacturer with color fill defined by cyl count

In [23]:
ggplot(mpg,aes(x=manufacturer)) + geom_bar(aes(fill=factor(cyl))) + theme_gdocs()
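
The fill aesthetic above uses factor(cyl) rather than cyl because cyl is stored as a number, and the bars can only be split into coloured segments by a discrete variable. As a rough sketch, the underlying counts can be inspected directly; the names cyl_levels and counts are hypothetical and introduced only for this example.

```r
library(ggplot2)

# cyl is a numeric (integer) column in mpg
print(class(mpg$cyl))

# factor() converts it to a discrete variable with one level per distinct value
cyl_levels <- levels(factor(mpg$cyl))
print(cyl_levels)

# Cars per manufacturer and cylinder count: these are the heights
# of the stacked segments in the barplot
counts <- table(mpg$manufacturer, mpg$cyl)
print(counts)
```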

Switch now to use the txhousing dataset that comes with ggplot2

In [54]:
head(txhousing)
Out[54]:
  city    year month sales volume   median listings inventory date
1 Abilene 2000 1     72    5380000  71400  701      6.3       2000
2 Abilene 2000 2     98    6505000  58700  746      6.6       2000.083
3 Abilene 2000 3     130   9285000  58100  784      6.8       2000.167
4 Abilene 2000 4     98    9730000  68600  785      6.9       2000.25
5 Abilene 2000 5     141   10590000 67300  794      6.8       2000.333
6 Abilene 2000 6     156   13910000 66900  780      6.6       2000.417

Create a scatterplot of volume versus sales. Afterwards, play around with the alpha and color arguments to make the overlapping points easier to read.

In [74]:
pl <- ggplot(txhousing,aes(x=sales,y=volume)) + geom_point(color='blue',alpha=0.5)
print(pl)
Warning message:
Removed 568 rows containing missing values (geom_point).
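
The warning above means that some rows have NA in sales or volume, so geom_point() cannot draw them. A minimal sketch of how one might check this and, optionally, filter those rows out before plotting is shown here; the names missing_rows, n_missing, tx_complete and pl_complete are hypothetical and not part of the original solution.

```r
library(ggplot2)

# Flag rows where either plotted variable is missing
missing_rows <- is.na(txhousing$sales) | is.na(txhousing$volume)

# Number of rows that cannot be drawn; this should correspond to
# the count reported in the warning
n_missing <- sum(missing_rows)
print(n_missing)

# Optional: keep only complete rows so the plot no longer warns
tx_complete <- txhousing[!missing_rows, ]
pl_complete <- ggplot(tx_complete, aes(x = sales, y = volume)) +
  geom_point(color = 'blue', alpha = 0.5)

print(pl_complete)
```

The warning is harmless for this exercise, so filtering is a matter of taste rather than a required step.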

Add a smooth fit line to the scatterplot from above. Hint: You may need to look up geom_smooth()

In [87]:
pl + geom_smooth(color='red')
Warning message:
Removed 568 rows containing non-finite values (stat_smooth).
Warning message:
Removed 568 rows containing missing values (geom_point).
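
When no method is given, geom_smooth() picks a smoother automatically based on the size of the data. If a straight-line fit is preferred, the method can be set explicitly. The sketch below rebuilds the scatterplot so that it runs on its own and assumes a linear model (method = 'lm') purely as an illustration; the name pl_lm is hypothetical.

```r
library(ggplot2)

# Scatterplot of volume versus sales with an explicit linear fit.
# method = 'lm' is an illustrative choice, not the exercise's answer,
# which relies on the default smoother.
pl_lm <- ggplot(txhousing, aes(x = sales, y = volume)) +
  geom_point(color = 'blue', alpha = 0.5) +
  geom_smooth(method = 'lm', color = 'red')

print(pl_lm)
```

As with the original plot, rows with missing sales or volume are dropped with a warning.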

Great Job!

Up next is a data visualization project in which you will build a real data visualization used in The Economist.